Networked industrial control system has analyzer that employs rules-based
determination and statistical analysis to distribute controller-based
resources to industrial controllers
Patent Details
US 7058712 B1 EN 17 8

Alerting Abstract US B1

NOVELTY - The analyzer has classification models to learn data
patterns related to queried resources, for generating probabilities that
predict possible future data patterns. The analyzer employs rules-based
determination, algorithmic determinations, statistical analysis and
inference analysis to distribute the controller-based resources to the
industrial controllers.

DESCRIPTION - The analyzer comprises classification models, e.g.
support vector machines (SVM), Naive Bayes, Bayes Net, decision tree,
similarity-based and vector-based models, to learn data patterns related to
queried resources. The analyzer employs statistical analysis such as
averaging, standard deviations, comparisons, sampling, frequency and
periodicity determinations to distribute the controller-based resources to
the industrial controllers. The analyzer also uses a general probabilistic
estimate to determine a performance condition given monitored evidence of
an input pattern. The estimate is stated as Pr(Cp | E1, ..., EJ), where
'Pr' is a probability, 'Cp' relates to a monitored performance condition
given evidence 'E' relating to differences from monitored patterns, and
'J' is an integer. The evidence includes consistency data with a previous
pattern to predict likely future outcomes. The distribution engines and
associated drivers propagate controller-based resources between industrial
controllers and the remote system, e.g. computer, workstation,
communication module, input/output device and network device.
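
The probabilistic estimate described above can be illustrated with a
small sketch. The record does not say how the probability is computed, so
this example assumes a Naive Bayes factorization (evidence items treated
as conditionally independent). The condition names, evidence names and
all numeric values are hypothetical.

```python
from math import prod

def posterior(priors, likelihoods, evidence):
    """Return Pr(condition | evidence) for every condition.

    Assumes the evidence items are conditionally independent given the
    condition (Naive Bayes); this is an assumption of the sketch.
    """
    scores = {
        c: priors[c] * prod(
            likelihoods[c][e] if seen else 1.0 - likelihoods[c][e]
            for e, seen in evidence.items()
        )
        for c in priors
    }
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

# Hypothetical prior probability of each performance condition.
priors = {"normal": 0.9, "degraded": 0.1}
# Hypothetical Pr(evidence observed | condition).
likelihoods = {
    "normal": {"scan_time_high": 0.05, "retries_high": 0.10},
    "degraded": {"scan_time_high": 0.70, "retries_high": 0.60},
}

result = posterior(
    priors, likelihoods, {"scan_time_high": True, "retries_high": True}
)
```

With both evidence items observed, the "degraded" condition ends up more
probable than "normal" even though its prior is much smaller.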

USE - For controlling industrial processes, manufacturing equipment, 
factory automation using internet. 

ADVANTAGE - The analyzer transforms XML data to other protocols to 
facilitate more efficient processing of data required from other sources. A 
flexible application distribution framework is used to support distributed 
processing and configuration within industrial controller environment. 
Application or component distribution testing and configuration is 
automated in accordance with coordinated component interactions to improve 
overall performance. 

DESCRIPTION OF DRAWINGS - The figure shows the flow diagram explaining
the resource processing and distribution in industrial controller
environment.
Alerting Abstract WO A1

NOVELTY - An integrated spectral data processing, data mining, and 
modeling system includes a general purpose computer system; machine 
readable storage medium containing a set of instructions for the computer 
system; and a tracking database containing the results of the model 
building, visualization, analysis, and/or prediction of the data. The 
instructions integrate modules into an integrated spectral data processing 
system. 

DESCRIPTION - An integrated spectral data processing, data mining, and 
modeling system includes a general purpose computer system; machine 
readable storage medium containing a set of instructions for the computer 
system; and a tracking database containing the results of the model 
building, visualization, analysis, and/or prediction of the data. The 
instructions integrate modules into an integrated spectral data processing 
system. The modules include a module (A) operating on raw data from files
created by the analytical spectrographic instrument and storing raw
processed data in a file; a module (B) operating on the raw processed data
and containing instructions for providing data standardization of the raw
data and storing standardized individualized spectral data in a file and/or
a library of files; a module (C) operating on the standardized
individualized spectral data and containing instructions for reducing the
individualized spectral data into a modeling form and storing the modeling
form of the data in a file; and a module (D) operative on the data reduced
to modeling form and containing instructions providing a user of the
system with tools for performing model building, visualization, analysis
and/or prediction of the data.

USE - Used in diverse screening and biomarker discovery applications and 
in conjunction with an analytical spectrographic instrument collecting data 
from a chemical or biological sample. 

ADVANTAGE - The system provides for automated processing of raw spectral
data, data standardization, reduction of data to modeling form, and
unsupervised and supervised model building, visualization, analysis, and
prediction. It enables the user to perform visual data mining, statistical
analysis and feature extraction.

DESCRIPTION OF DRAWINGS - The figure is a labeled flow chart illustrating
the software module.
Alerting Abstract US A1

NOVELTY - Development of a multi-tiered calibration model involves
classifying spectral measurements using a calibration set, into previously
defined classes based on prior information of the subject. The calibration
set comprises a data set of exemplar spectral measurements from a
representative sampling of a subject population.

DESCRIPTION - Development of a multi-tiered calibration model involves:
classifying spectral measurements (I) using a calibration set (a) into
previously defined classes (b) using prior information of the subject;
classifying the measured spectrum (c) into (b) using instrumental
measurements at a tissue measurement site; and extracting features from
(c) for further classification. (a) comprises a data set of exemplar (I)
from a representative sampling of a subject population. A decision rule
makes the class assignments.

USE - For developing a multi-tiered calibration model for estimating
concentration of a target blood analyte from measured tissue spectra
(claimed).

ADVANTAGE - The method localizes calibration and sample spectra into
local groups that are used to reduce variation in sample spectra due to
co-variation of spectral interferents, sample heterogeneity, state
variation and structural variation. The method provides measurement spectra
which are associated with localized calibration models that are designed to
produce the most accurate estimates for the patient at the time of
measurement. The method avoids modeling differences between patients and
thus can be generalized to more individuals.
Alerting Abstract WO A2

NOVELTY - Improving (M1) the prediction of biological data comprises:

1. inputting data relating to one or more biomolecules into a universal
database;

2. identifying at least one grouping of data to be obtained from the
database;

3. inputting at least one data grouping into a neural network program; and

4. adapting a software program including initial design rules to the
new design rules.

DESCRIPTION - Improving (M1) the prediction of biological data
comprises:

1. inputting data relating to one or more biomolecules into a universal
database;

2. identifying at least one grouping of data to be obtained from the
universal database;

3. inputting at least one data grouping into a neural network program,
which can analyze the data and generate new design rules for predicting
biological data; and

4. adapting a software program including initial design rules to the
new design rules.

INDEPENDENT CLAIMS are also included for the following:

1. A computer based system for prediction of biological data, comprising:

   1. a database for storing and retrieval of data relating to
   biomolecules;

   2. a design software program comprising initial design rules, the design
   software program being configured to be capable of adapting to new
   design rules; and

   3. a neural network program, which can analyze data stored in the
   database and generate new design rules;

2. A computer related method (M2) for analysis of biological data,
comprising:

   1. generating data from one or more biomolecules;

   2. inputting data into a universal database;

   3. identifying at least one grouping of data to be mined from the
   universal database;

   4. inputting at least one data grouping into a design software program,
   the design software program comprising design rules; and

   5. inputting an output of the design software program to a neural
   network program which can analyze the data and generate new design
   rules, which new rules can be inputted into the design program;

3. A computer based method (M3) comprising generating data from one or
more biological molecules and utilizing a neural network to manipulate
the data;

4. A computer based system for analysis of data, comprising:

   1. an analysis substrate;

   2. a database;

   3. a design software program comprising design rules; and

   4. a neural network program which can analyze data of the design
   software program;

5. An automated method (M4) for analysis of biological data, comprising
inputting data into a computer program which comprises a first mode
that provides a training condition, and a second mode that provides a
question and answer condition.



USE - The new method (M1) is useful for improving the prediction of
biological data (claimed). The methods, systems and processes of the
present invention are useful for organizing expression of information
relating to biomolecules in a way that facilitates data mining, designing
of capture probes and primers, and designing microarrays for use in
biological research.
Determination of phonological rules in speech recognition system - combines
pairs of prototype clusters that exhibit statistical differences of less
than threshold value to generate new prototype cluster
Alerting Abstract EP A

The generation method comprises the steps of processing a training text
and vocalizations representing the training text to obtain a number of
samples representing the language components of the vocalizations and
selecting, from among the samples, a set of samples representing respective
instances of a selected language component in the vocalization. Each of
the selected samples is annotated with an indicator of one language
component in a contextual relationship with the selected sample.

There is a step of generating, from the further annotated selected
samples, a decision tree that separates the selected samples into
respectively different leaf groups based on the context indicators, each of
the leaf groups representing a pronunciation of the selected language
component in a respectively different context. The annotated selected
samples are grouped into a set of clusters, each cluster representing a
respectively different pronunciation of the selected language component.

USE - Recognises continuously spoken words. 

Equivalent Alerting Abstract US A 

A continuous speech recognition system includes an automatic 
phonological rules generator which determines variations in the 
pronunciation of phonemes based on the context in which they occur. 

This phonological rules generator associates sequences of labels derived 
from vocalisations of a training text with respective phonemes inferred 
from the training text. 

These sequences are then annotated with their phoneme context from the 
training text and clustered into groups representing similar pronunciations 
of each phoneme. 

A decision tree is generated using the context information of the
sequences to predict the clusters to which the sequences belong. The
training data is processed by the decision tree to divide the sequences
into leaf-groups representing similar pronunciations of each phoneme.

The sequences in each leaf-group are clustered into sub-groups 
representing respectively different pronunciations of their corresponding 
phoneme in a given context. A Markov model is generated for each 
sub-group. 

The various Markov models of a leaf-group are combined into a single
compound model by assigning common initial and final states to each
model. The compound Markov models are used by a speech recognition
system to analyse an unknown sequence of labels given its context.

(40pp)
Computerized text classifier system, has modeling engine calculating
set of match scores for concept model by using knowledge base, where each
score has associated category with suggested action
NOVELTY - The system has a pre-processor to analyze text to identify
concepts and generate a concept model containing identified concepts. An 
adaptive knowledge base (118) has a set of learning nodes, each provided 
with statistical information corresponding to a category. A modeling engine 
(116) calculates a set of match scores for the model by using the knowledge 
base. Each score has a category with a suggested action. 

DESCRIPTION - An INDEPENDENT CLAIM is also included for a method of 
classifying text on a computer. 

USE - Used for classifying text on a computer. 

ADVANTAGE - The statistical engine learns and adapts with every 
relationship event that it sees, thus maintaining a high level of accuracy 
over time. 

DESCRIPTION OF DRAWINGS - The drawing shows a block diagram of an 
electronic communication system. 

100 Electronic communication management system 

112 Contact center 

114 Universal data model 

116 Modeling engine 

118 Adaptive knowledge base 

120 Data access services
Claims:

...concepts and generate a concept model containing the identified
concepts; a knowledge base having a plurality of nodes including a set
of learning nodes, each of the learning nodes being provided with
statistical...
Product information merging method for web-based transactions, involves
generating Naive-Bayes classifier using text and attributes associated with
product information in new hierarchy
Alerting Abstract US A1

NOVELTY - A Naive-Bayes classifier is generated using text and attributes 
associated with the product information in a new hierarchy. The product 
information in the new hierarchy, is associated with nodes in main 
hierarchy corresponding to highest classification probability for product, 
using the classifier. 

DESCRIPTION - INDEPENDENT CLAIMS are included for the following: 

1. Computer system; and 

2. Computer readable storage device storing product information merging 
program. 

USE - For merging product information in different hierarchies, in 
web-based transactions. 

ADVANTAGE - The products or product information in the new hierarchy are
easily merged with product information in the main hierarchy using a
simple technique.

DESCRIPTION OF DRAWINGS - The figure shows the flowchart illustrating the 
computer implemented product information merging process. 
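
The merging step in this record can be sketched as follows. This is a
minimal illustration, not the patented method: it assumes a multinomial
word model with add-one smoothing for the text and a Gaussian model for a
single numeric attribute (price). The class name, node names and product
data are all hypothetical.

```python
import math
from collections import Counter, defaultdict

class HierarchyNaiveBayes:
    """Naive Bayes over main-hierarchy nodes using text plus one attribute."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # node -> word frequencies
        self.prices = defaultdict(list)          # node -> observed prices
        self.vocab = set()

    def add(self, node, text, price):
        words = text.lower().split()
        self.word_counts[node].update(words)
        self.vocab.update(words)
        self.prices[node].append(price)

    def log_score(self, node, text, price):
        n_items = sum(len(p) for p in self.prices.values())
        score = math.log(len(self.prices[node]) / n_items)  # class prior
        total = sum(self.word_counts[node].values())
        for w in text.lower().split():
            # Add-one smoothing keeps unseen words from zeroing the product.
            count = self.word_counts[node][w]
            score += math.log((count + 1) / (total + len(self.vocab)))
        vals = self.prices[node]
        mean = sum(vals) / len(vals)
        # The +1.0 is a variance floor chosen for this sketch.
        var = sum((v - mean) ** 2 for v in vals) / len(vals) + 1.0
        score += -0.5 * math.log(2 * math.pi * var)
        score -= (price - mean) ** 2 / (2 * var)
        return score

    def best_node(self, text, price):
        """Node of the main hierarchy with the highest probability."""
        return max(self.prices, key=lambda n: self.log_score(n, text, price))

model = HierarchyNaiveBayes()
model.add("Cameras", "digital camera 12 megapixel zoom", 250)
model.add("Cameras", "compact camera optical zoom", 180)
model.add("Laptops", "laptop 15 inch notebook", 900)
model.add("Laptops", "ultrabook laptop ssd", 1200)

# A product from the new hierarchy is attached to the best-scoring node.
target = model.best_node("zoom camera with tripod", 210)
```

Log-probabilities are summed instead of multiplying raw probabilities,
which is numerically safer but equivalent for picking the maximum.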
Original Abstracts:

...for merging product information from a first hierarchy into a second 
hierarchy. A Naive Bayes classification model is generated using both 
text data and attribute (numerical) data pertaining to products in the... 

Claims:

...and using the classifier, associating at least some product information
in the first hierarchy with nodes in the second hierarchy...

...and using the classifier, associating at least some product information
in the first hierarchy with nodes in the second hierarchy, wherein the
generating act includes multiplying at least one probability based at least
partially...
Bandwidth allocation method for packet telecommunications system - mapping
packet network flow based on individual instances of traffic objects, to
defined traffic classes arbitrarily assigned by off-line manager
Alerting Abstract WO A2

The method for classifying packet network flows involves applying 
individual instances of traffic objects, i.e. packet network flows, to a 
classification model based on selectable information obtained from a 
number of layers of a multi-layered communication protocol. 

The flow is then mapped to the defined traffic classes, which are 
arbitrarily assignable by an off-line manager which creates the 
classification. The classification need not be a complete enumeration of 
the possible traffic. 

USE - Managing flow bandwidth utilisation at network, transport and 
application layers in store and forward network, for classifying packet 
network flows for use in determining policy or rule of assignment of 
service level, and enforcing policy by direct rate control. 

ADVANTAGE - Allows classification of traffic according to definable set 
of classification attributes selected by manager, including subset of 
traffic of interest to be inserted. 
...method comprises applying individual instances of traffic objects, i.e.
packet network flows to a classification model based on selectable
information obtained from a plurality of layers of a multi-layered
communication...

Claims:

...specification of the parsing step to a plurality of
hierarchically-recognized classes represented by a plurality of nodes,
each node having a traffic specification and a mask, according to the
mask; thereupon, having found a...

...said flow specification with one of said plurality of
hierarchically-recognized classes represented by a plurality nodes;
and allocating bandwidth resources according to a policy associated with
said class by allocating a...

...the flow specification of the parsing step to a plurality of
hierarchically-recognized classes represented by a plurality of nodes,
each node having a traffic specification and a mask, according to the
mask; thereupon, having...

...associating said flow specification with one class of said plurality of
hierarchically-recognized classes represented by a plurality nodes;
and allocating bandwidth resources according to a policy associated with
said class.
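
The claim excerpts in this record describe matching a flow specification
against hierarchically arranged class nodes, each with a traffic
specification and a mask. A minimal sketch of that idea follows. The
field names, class names and tree layout are hypothetical, and the mask
is modeled simply as the set of fields that must be compared.

```python
from dataclasses import dataclass, field

@dataclass
class ClassNode:
    name: str
    spec: dict        # field values this traffic class expects
    mask: frozenset   # which fields of the spec are actually compared
    children: list = field(default_factory=list)

    def matches(self, flow):
        # Only masked fields are checked; all others are "don't care".
        return all(flow.get(k) == self.spec[k] for k in self.mask)

def classify(node, flow):
    """Return the name of the deepest matching class, or None."""
    if not node.matches(flow):
        return None
    for child in node.children:
        found = classify(child, flow)
        if found is not None:
            return found
    return node.name

web = ClassNode("web", {"protocol": "tcp", "dst_port": 80},
                frozenset({"protocol", "dst_port"}))
tcp = ClassNode("tcp", {"protocol": "tcp"}, frozenset({"protocol"}), [web])
voip = ClassNode("voip", {"protocol": "udp", "app": "rtp"},
                 frozenset({"protocol", "app"}))
# An empty mask matches every flow, so unlisted traffic still gets a class.
root = ClassNode("default", {}, frozenset(), [tcp, voip])

flow = {"protocol": "tcp", "dst_port": 80, "src_ip": "10.0.0.1"}
traffic_class = classify(root, flow)
```

The catch-all root reflects the abstract's remark that the classification
need not enumerate all possible traffic.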
State probability approximation method for modeling probabilistic systems
in Markov networks uses belief propagation and message passing scheme
NOVELTY - Modelling method includes linked nodes that represent states
of system parts and each link represents statistical dependencies between 
possible states of related nodes. Nodes are grouped into clusters and based 
on size, messages are associated with sets of source and destination nodes 
and a message dependent rule and on selected links connecting nodes. 
Values of messages are updated (321) until a termination condition is 
reached. 

DESCRIPTION - When a termination state is reached, the probabilities of 
the states of the system are determined from the values of the messages. 

An INDEPENDENT CLAIM is also included for a method that determines
approximate probabilities of states of a system represented by a model.

USE - For a method to approximate both the marginal probabilities and
maximum a posteriori probability (MAP) states in Markov networks with
loops.

ADVANTAGE - The method gives more accurate answers for marginal 
probabilities and it can converge to a single answer in cases where a 
recursive method does not. 

DESCRIPTION OF DRAWINGS - The figure represents a flow diagram of a 
method for propagating beliefs in a network. 

321 Messages update rules 
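
For orientation, the message-update loop can be sketched with ordinary
sum-product belief propagation on a pairwise Markov network. This is the
standard node-to-node scheme, not the cluster-based variant the record
describes; the function name, the termination tolerance and the example
potentials are assumptions of the sketch.

```python
import math

def belief_propagation(unary, pairwise, edges, max_iters=50, tol=1e-9):
    """Sum-product message passing; returns approximate node marginals."""
    n_states = len(next(iter(unary.values())))
    directed = list(edges) + [(j, i) for i, j in edges]
    msgs = {d: [1.0 / n_states] * n_states for d in directed}

    def psi(i, j, xi, xj):
        # Potentials are stored once per undirected edge.
        if (i, j) in pairwise:
            return pairwise[(i, j)][xi][xj]
        return pairwise[(j, i)][xj][xi]

    for _ in range(max_iters):
        new = {}
        for i, j in directed:
            out = []
            for xj in range(n_states):
                total = 0.0
                for xi in range(n_states):
                    # Product of messages into i from all neighbours but j.
                    incoming = math.prod(
                        msgs[(k, t)][xi]
                        for k, t in directed if t == i and k != j
                    )
                    total += unary[i][xi] * psi(i, j, xi, xj) * incoming
                out.append(total)
            z = sum(out)
            new[(i, j)] = [v / z for v in out]
        delta = max(abs(a - b)
                    for d in directed for a, b in zip(new[d], msgs[d]))
        msgs = new
        if delta < tol:  # termination condition: messages stopped changing
            break

    beliefs = {}
    for i in unary:
        b = [unary[i][x] * math.prod(msgs[(k, t)][x]
                                     for k, t in directed if t == i)
             for x in range(n_states)]
        z = sum(b)
        beliefs[i] = [v / z for v in b]
    return beliefs

unary = {"A": [0.7, 0.3], "B": [0.5, 0.5]}
pairwise = {("A", "B"): [[0.9, 0.1], [0.1, 0.9]]}
beliefs = belief_propagation(unary, pairwise, [("A", "B")])
```

On a loop-free network such as this two-node example the result is exact;
on networks with loops the same updates give only an approximation.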
Alerting Abstract WO A1

NOVELTY - Method consists of running a test program on a target component
and finding its nodes, monitoring system performance at the nodes,
collecting performance data at the nodes over a set period, determining
the causal relationships between the nodes, and comparing them with a
model. The system model is then modified: if a causal relationship fails
to match, it is added to the model and the administrator is alerted. The
data is converted into Boolean values corresponding to performance
threshold conditions for averaging.
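
The Boolean conversion mentioned above can be pictured with a short
sketch. The metric, the threshold and the sample values are hypothetical,
and the record does not state the exact averaging procedure; here the
mean of the Boolean flags is taken as the fraction of samples that
violate the threshold.

```python
def to_boolean(samples, threshold):
    """True where a performance sample exceeds its threshold."""
    return [value > threshold for value in samples]

def exceed_rate(flags):
    """Average of the Boolean flags (True counts as 1)."""
    return sum(flags) / len(flags)

cpu_load = [0.35, 0.92, 0.97, 0.40, 0.95]  # hypothetical node samples
flags = to_boolean(cpu_load, threshold=0.90)
rate = exceed_rate(flags)
```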

DESCRIPTION - There are INDEPENDENT CLAIMS for (1) an IT system and (2) a 
computer program. 

USE - Method is for iteratively determining complex IT systems component
associations.

ADVANTAGE - Method enables automatic determination of the causal 
relationships between various subsystems and elements of complex networks, 
accumulates data, reduces the human intervention required and analyzes 
system performances with Boolean attributes. 

DESCRIPTION OF DRAWINGS - The drawing shows a block diagram of a system
for adaptive system management.
Fingerprint image data processing - determining in accordance with rule
based classification technique for most probable ones of local pattern
types for identified region of interest
The method involves prescreening the data to determine locations of
regions of interest in the fingerprint data. For a determined location
of a region of interest, it requires extracting a set of feature vectors
within an area that includes the determined location. The extracted set of
feature vectors is then applied to a number of input nodes of a
multi-layer neural network. From an output of the multi-layer neural
network, it entails determining a set of probabilities that the set of
feature vectors represent individual ones of a plurality of predetermined
local pattern types.

In accordance with a rule based classification technique, it allows
determining the most probable ones of the local pattern types for the
identified regions of interest.

USE/ADVANTAGE - In learning computer program which examines fingerprints
for detection of local pattern forms in region of interest. Robust in
nature, model based, while capable of handling and quickly learning by
using fingerprint data which is statistical in nature.
Generation of rock classifications corresponding to depths in wellbore
involves receiving value indicative of percent dry weight of total
carbonate, total quartz-feldspar-mica, total clay, and values corresponding
to depth in wellbore
NOVELTY - Rock classification corresponding to depths in a wellbore is
generated by receiving a value indicative of percent dry weight of total
carbonate, percent dry weight of total quartz-feldspar-mica (QFM), percent
dry weight of total clay for each depth in the wellbore, and values
received corresponding to depth in the wellbore.

DESCRIPTION - Generation of rock classifications corresponding to depths
in a wellbore involves receiving a value indicative of percent dry weight
of total carbonate, percent dry weight of total quartz-feldspar-mica,
percent dry weight of total clay for each depth in the wellbore, and
values received corresponding to depth in the wellbore; comparing the
particular values at a particular depth in the wellbore with rules
corresponding to rock classifications in a rule base; determining a
particular rock classification that corresponds to rules in the rule base
when a match is found between the particular values and rules in the rule
base; and associating the particular values at the particular depth in the
wellbore with the particular rock classification.
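
The rule-base matching described above can be sketched as a lookup over
the three percent-dry-weight values. The record gives no thresholds or
class names, so the 50 percent cut-offs, the labels and the sample log
below are purely hypothetical.

```python
# Each rule pairs a hypothetical class label with a predicate over
# (carbonate, qfm, clay) percent dry weight.
RULES = [
    ("carbonate", lambda c, q, k: c >= 50),
    ("sandstone", lambda c, q, k: q >= 50),
    ("shale", lambda c, q, k: k >= 50),
]

def classify_rock(carbonate, qfm, clay):
    """Return the label of the first matching rule, else 'mixed'."""
    for name, rule in RULES:
        if rule(carbonate, qfm, clay):
            return name
    return "mixed"

# Hypothetical log: depth -> (carbonate, QFM, clay) in percent dry weight.
log = {
    1500.0: (70, 20, 10),
    1500.5: (10, 75, 15),
    1501.0: (30, 35, 35),
}

# Associate every depth with its rock classification.
classified = {depth: classify_rock(*values) for depth, values in log.items()}
```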

An INDEPENDENT CLAIM is also included for a program storage device 
readable for generating rock classifications corresponding to depths in a 
wellbore comprising program of instructions executable through the machine. 

USE - For generating rock classifications corresponding to depths in a
wellbore.

ADVANTAGE - The novel method uses the strengths of the spectroscopy tools 
in general and borehole imaging tools, and enhances the subsurface 
geological description and interpretation. 

DESCRIPTION OF DRAWINGS - The figure illustrates a computer system that
stores the ternary diagram model for generating rock classification as
above, corresponding to data points that are output from the elemental
capture spectroscopy sonde.
define software applications.

USE - Used for defining software application in business process
management in commercial company.
one of the sub process models, slots, data models, sub data model and flow
rules, where the flow rules connect a pair of slots, data models and the 
sub data models, thereby effectively avoiding the need for writing of 
source code for developing software applications. 

DESCRIPTION OF DRAWINGS - The drawing shows a screen-shot illustration of
Detailed Description

... rk) and W2(rk) according to (2) and (3) (in case of m-of-n system
update all the m rule nodes with the highest A1 activation).

Apply aggregation procedure of rule nodes after each group...

...complexity of the data used for their training; (ii) they can perform
both clustering and classification/prediction; (iii) models can be
adapted on new data without the need to be retrained on old data...
match category, and the definition of the category
is updated in accordance with a learning rule to include
the contribution from the new training pattern. If the
correlation is below the threshold, a...

...features or views of the subjects. For example, if the system is used
to visually classify automobiles by model, each model of
automobile would be a separate class. Specific
recognizable features of the automobiles...
Claims

Detailed Description

provides a computer program product comprising a machine readable
medium on which is provided program instructions for creating a
biological classification model for classifying the effect of stimuli on
biological systems...

...signatures representing the multivariate response of stimuli at various
levels; selecting a collection of biological features to be used in a
proposed model; computing distances between the stimulus response paths
in a biological feature space defined by the biological features
selected; characterizing the proposed model based on how well it groups
stimulus response paths into the known classifications of the associated
stimuli in the biological features space; repeating the selecting,
computing, and characterizing for a plurality of selected collections of
biological features; and choosing a proposed model as the biological
classification model based on the characterizations made. In one
embodiment of the computer program product the collection of biological
features selected by the instructions may comprise one or more of
morphological details, texture measures for a marker, intensity measures
for a marker, statistical details, and values derived from any of the
foregoing of a cell or cell population. In another embodiment of the
computer program product the instructions for computing the distances
may comprise instructions for computing an angle or an inner product
of vectors, each from a common center point in the biological feature
space, wherein a first vector passes through one of the signatures or
points of a...

...of a second stimulus response path. In another embodiment of the
computer program product the instructions for computing the distances
may comprise instructions for computing Euclidean distances.
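
The three distance measures named in this excerpt (angle, inner product
and Euclidean distance) can be written down directly. The function names
and the example points are illustrative only; points are treated as plain
coordinate tuples in the feature space.

```python
import math

def inner_product(u, v):
    return sum(a * b for a, b in zip(u, v))

def angle_between(p, q, center):
    """Angle in radians between vectors from a common center to p and q."""
    u = [a - c for a, c in zip(p, center)]
    v = [b - c for b, c in zip(q, center)]
    cosine = inner_product(u, v) / (math.hypot(*u) * math.hypot(*v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cosine)))

def euclidean(p, q):
    return math.dist(p, q)

theta = angle_between((1.0, 0.0), (0.0, 1.0), center=(0.0, 0.0))
d = euclidean((0.0, 0.0), (3.0, 4.0))
ip = inner_product((1.0, 2.0), (3.0, 4.0))
```

The angle ignores how far each point lies from the center, whereas the
Euclidean distance is sensitive to it, which is why a model might be
evaluated with either.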

[0027] In another aspect the invention provides a method of determining 
Claim

40 A computer program product comprising a machine readable medium on
which is provided program instructions for creating a biological
classification model for classifying the effect of stimuli on biological
systems...

...representing the multivariate response of stimuli at various levels;
(b) selecting a collection of biological features to be used in a
proposed model;
(c) computing distances between the stimulus response paths in a
biological feature space defined by the biological features selected
in (b);
(d) characterizing the proposed model based on how well it groups
stimulus response paths into the known classifications of the associated
stimuli in the biological features space;
(e) repeating (b) - (d) for a plurality of selected collections of
biological features; and
(f) choosing a proposed model as the biological classification model
based on the characterizations made in (d).

41 The computer program product of claim 40, wherein the collection of
biological features selected by the instructions in (b) comprises one
or more of morphological details, texture measures for a marker,
intensity measures for a marker, statistical details, and values
derived from any of the foregoing of a cell or cell population.

42 The computer program product of claim 40, wherein the instructions
for computing the distances in (c) comprise instructions for computing
an angle or an inner product of vectors, each from a common center point
in the biological feature space, wherein a first vector passes through
one of the signatures or points of a...

...a second stimulus response path.

43 The computer program product of claim 40, wherein the instructions
for computing the distances in (c) comprise instructions for computing
Euclidean distances.

44 A method of determining a separation distance between response paths
Detailed Description
...GENOMIC APPLICATIONS
FIELD OF THE INVENTION

The field of this invention is the application of classification tree
models incorporating Bayesian analysis to the statistical prediction
of binary outcomes especially in clinical, genomic and medical
applications.

BACKGROUND OF THE INVENTION

Bayesian analysis is an approach to statistical analysis that is
based on the Bayes' law, which states that the posterior probability...

...the latter attempts to establish confidence intervals around parameters,
and/or falsify a priori null-hypotheses, a Bayesian approach attempts to
keep track of how a priori expectations about some phenomenon of interest

...more measurement variables and one variable that determines the class
of the sample. Various splitting rules have been used; however, the
success of the predictive ability varies considerably as data sets...
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... utilising Evolving Connectionsist Systems (ECOS) techniques have the 
following advantages when compared with the traditional statistical and 
neural network techniques: (i) they have a flexible structure that 
reflects the complexity of the data used for their training; (ii) they 
perform both clustering and classification/prediction; (iii) the
models can be adapted on new data without the need to be retrained on 
old data; (iv) they can be used to extract rules (profiles) of 
different sub-classes of samples. The rules (profiles) are fuzzy with 
some statistical coefficients attached. 



It is therefore an object of the present invention to provide a method 
for determining a relationship between gene expression data and one or 
more conditions or prognostic outcome, or at least to provide the 
public with a useful choice. 

SUMMARY . . . 

...input layer comprising one or more input nodes configured to receive
gene expression data; a rule base layer comprising one or more rule
nodes; an output layer comprising one or more output nodes configured to
output one or more conditions; and an adaptive component configured to
extract one or more rules from the rule base layer representing
relationships between the gene expression data and the one or more
conditions...
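
The layered arrangement summarized above (input nodes, rule nodes, output conditions, rule extraction) can be pictured with a heavily simplified sketch. This is not the ECOS algorithm of the application: the class names, the `radius` and `learning_rate` parameters, and the nearest-prototype update are all assumptions chosen only to show how a rule layer might grow, adapt to new data without retraining on old data, and yield readable profiles.

```python
import math


class RuleNode:
    """One rule node: a prototype expression profile tied to a condition."""

    def __init__(self, centre, condition):
        self.centre = list(centre)
        self.condition = condition
        self.count = 1  # number of samples absorbed by this node


class TinyEvolvingClassifier:
    """Hypothetical rule layer that adds or adapts nodes one sample at a time."""

    def __init__(self, radius=1.0, learning_rate=0.5):
        self.radius = radius                # assumed influence radius of a node
        self.learning_rate = learning_rate  # assumed step size for adaptation
        self.rule_nodes = []

    @staticmethod
    def _distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def learn_one(self, expression, condition):
        """Adapt the nearest same-condition node, or create a new node."""
        same = [n for n in self.rule_nodes if n.condition == condition]
        nearest = min(
            same, key=lambda n: self._distance(n.centre, expression), default=None
        )
        if nearest is None or self._distance(nearest.centre, expression) > self.radius:
            self.rule_nodes.append(RuleNode(expression, condition))
            return
        nearest.centre = [
            c + self.learning_rate * (x - c)
            for c, x in zip(nearest.centre, expression)
        ]
        nearest.count += 1

    def predict(self, expression):
        """Output the condition of the closest rule node."""
        if not self.rule_nodes:
            raise ValueError("no rule nodes have been learned yet")
        nearest = min(
            self.rule_nodes, key=lambda n: self._distance(n.centre, expression)
        )
        return nearest.condition

    def extract_rules(self):
        """Return each node as a (profile, condition, support) triple."""
        return [(n.centre, n.condition, n.count) for n in self.rule_nodes]
```

Each extracted triple reads as "samples with an expression profile near this centre tend to have this condition", which is the kind of profile the summary describes.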



14/3,K/10 (Item 9 from file: 349)

DIALOG(R) File 349: PCT FULLTEXT

(c) 2006 WIPO/Univentio. All rts. reserv.



00857259 **Image available** 

SYSTEM AND METHOD FOR AUTOMATICALLY CLASSIFYING TEXT 
PROCEDE ET SYSTEME DE CLASSIFICATION AUTOMATIQUE DE TEXTE

Patent Applicant/Assignee:

KANISA INC, 19925 Stevens Creek Blvd., Suite 150, Cupertino, CA 95014, US,
US (Residence), US (Nationality)
Inventor(s):

UKRAINCZYK Igor, 69 Olive Court, Mountain View, CA 94041, US, 
COPPERMAN Max, 233 Sunset Avenue, Santa Cruz, CA 95060, US, 
HUFFMAN Scott B, 195 Opal Avenue, Redwood City, CA 94062, US, 

Legal Representative: 

VIKSNINS Ann S (et al) (agent), Schwegman, Lundberg, Woessner & Kluth,
P.O. Box 2938, Minneapolis, MN 55402, US,

Patent and Priority Information (Country, Number, Date):

Patent: WO 200190921 A2-A3 20011129 (WO 0190921) 

Application: WO 2001US16872 20010525 (PCT/WO US0116872) 

Priority Application: US 2000206975 20000525 

Designated States: 

(Protection type is "patent" unless otherwise stated - for applications 
prior to 2004) 

AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EC 
EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS 
LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ 
TM TR TT TZ UA UG UZ VN YU ZA ZW 

(EP) AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR 

(OA) BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG 

(AP) GH GM KE LS MW MZ SD SL SZ TZ UG ZW 

(EA) AM AZ BY KG KZ MD RU TJ TM 
Publication Language: English 
Filing Language: English 
Fulltext Word Count: 14320 



Fulltext Availability: 
Detailed Description 



Detailed Description 

... when such techniques are applied to large volumes of text.

[08] Another approach is rule-based text classification systems which
classify documents according to rules written by people about the
relationship between ... categories. Once
developed, these statistical models may be used to classify new
documents. In systems that do utilize a learning...
...categorized training data or correctly categorized training data with
extraneous or unusual vocabulary degrades the statistical model,
causing the resulting classifier to perform poorly.

[10] Of the prior art systems that utilize training data, most do... 

...utilize user input, do not allow users to directly affect the quantified 
relationship between vocabulary features and classification categories, 
but simply allow the user to change the training data. Yet another... 
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... platforms that may be used to practice the invention. Thus, the 
particular functions of the rule engine, policy management engine, 
scoring server, order management workstation, and so forth may be 
provided...

...a combination of hardware and software, as described, or entirely in
hardware elements. Also, the statistical
model may be implemented in a variety of modes, including a neural
network, a multivariate regression model, or any other model that
classifies inputs based on statistical analysis of historical
exemplars. The particular capitalization or naming of the modules,
protocols, features, attributes, data structures, or any other aspect
is not mandatory or significant, and the mechanisms that implement the 
invention or its features may have different names or formats; likewise 
the details of the specific data structures, messages, and APIs may be 
changed without departing from the features and operations of the 
invention. Finally, it should be noted that the language used in... 
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tissue classification according to chemical and structural properties 
that employs NIR spectral measurements. A tissue classification model 
is developed by taking NIR spectral absorbance measurements from an 
exemplary population of individuals. 

The spectral measurements are assessed to identify features of interest 
most likely to represent variation between tissue types. Statistical
and analytical techniques are used to enhance the features of interest 
and extract those features representing variation within a tissue. A 
classification routine determines the best model to define classes... 

...the variation within a class is small compared to the variation between 
classes. A decision rule assigns class membership to individual members 
of the representative population based on the structural and chemical 
properties of each individual's tissue. 



The disclosed tissue classification model is applied in a 
non-invasive, in-vivo tissue classification procedure using NIR spectral 
measurements to classify individual tissue samples. The classification
model defines classes and provides a set of exemplary data that enable
the segregation of test ...

DETAILED DESCRIPTION

Various features of biological tissue can be measured using NIR 
spectroscopy because these features often have unique signatures in the 
NIR wavelength region (700 to 2500 nm) as a result of their absorbance and
scattering properties. Many of these features vary according to tissue 
type and are thus useful for classifying tissue into separate types. 
Useful features that can be measured using NIR absorbance and 
scattering patterns include, but are not limited...
...fraction of blood in tissue, spectral characteristics related to
environmental influences, and hematocrit levels. The features that vary
according to tissue type may be isolated from tissue sample spectra using
statistical techniques and can then be used to classify the sample
accordingly.

DEVELOPMENT OF A TISSUE CLASSIFICATION MODEL 

A non-invasive, in-vivo method for the classification of tissue samples 
according to chemical, physiological, and structural differences is 
described herein. The classification model employs the use of NIR 
measurements to quantify chemical, structural, or physiological 
properties of the... 

...1 provides a flow diagram of a general procedure used to develop a
classification model. In general, the algorithm for developing a
classification model comprises the following steps.

1. Providing exemplary NIR measurements (11)

2. Spectral feature selection (12)

3. Feature enhancement (13)

4. Feature extraction (14)

5. Factor selection (15)

6. Classification calibration (16)

7. Application of a Decision Rule (17)

8. Assignment to a group (18)

MEASUREMENT

NIR measurements (11) are first... in the art can appreciate other
methods of classification are readily applicable. 

DECISION RULE

A decision rule (17) is developed to determine to which class a sample 
belongs. The criterion the decision rule employs to determine the class 
membership of the sample is whether the sample's projection... 

...the
mean of the two population means (see R. Johnson and D. Wichern, Applied
Multivariate Statistical Analysis, 3rd ed., Prentice-Hall, New Jersey
(1992)). The scalar L is compared with L...

...is assigned to population two (18).
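
The excerpt elides the details of the decision rule, but its wording (a projection of the sample compared against the mean of two population means) matches the two-population linear discriminant rule in the cited Johnson and Wichern text. The sketch below assumes that reading. It is restricted to two features so that the pooled covariance matrix can be inverted by hand, and the function names and sample values are hypothetical.

```python
def mean_vector(rows):
    """Mean of a list of 2-feature samples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, mean):
    """2x2 sum of outer products of deviations from the mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fit_discriminant(pop1, pop2):
    """Return (w, threshold) for the linear discriminant projection.

    w = S_pooled^-1 (mean1 - mean2); the threshold is the projection of
    the midpoint between the two population means.
    """
    m1, m2 = mean_vector(pop1), mean_vector(pop2)
    s1, s2 = scatter(pop1, m1), scatter(pop2, m2)
    dof = len(pop1) + len(pop2) - 2
    sp = [[(s1[i][j] + s2[i][j]) / dof for j in range(2)] for i in range(2)]
    det = sp[0][0] * sp[1][1] - sp[0][1] * sp[1][0]
    if det == 0.0:
        raise ValueError("pooled covariance matrix is singular")
    inv = [[sp[1][1] / det, -sp[0][1] / det],
           [-sp[1][0] / det, sp[0][0] / det]]
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    threshold = 0.5 * (w[0] * (m1[0] + m2[0]) + w[1] * (m1[1] + m2[1]))
    return w, threshold


def assign_population(sample, w, threshold):
    """Assign to population 1 if the projection reaches the threshold, else 2."""
    score = w[0] * sample[0] + w[1] * sample[1]
    return 1 if score >= threshold else 2


# Made-up two-feature samples, for illustration only.
population_one = [(1.0, 2.0), (2.0, 1.0), (2.0, 3.0), (3.0, 2.0)]
population_two = [(6.0, 7.0), (7.0, 6.0), (7.0, 8.0), (8.0, 7.0)]
w, threshold = fit_discriminant(population_one, population_two)
```

With these values, a sample near (2, 2) is assigned to population one and a sample near (7, 7) to population two.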

CLASSIFICATION OF TISSUE SAMPLES 

Implementation of the disclosed classification model for 
classification of actual tissue samples is described in detail in the 
parent application to the current... 

...Ruchti. In general, the steps of a procedure for tissue classification
are:

1. NIR measurements

2. Feature extraction

3. Pattern classification

4. Assignment of class membership

A set of absorbance values pertaining... 
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... representation of the pattern classification system. The system has 
two general functions. 

* The extraction of features, and

* The classification of the features according to a classification
model and decision rule.

Feature extraction 25 is any mathematical transformation that enhances 
a particular aspect or quality of the data that is useful for 
interpretation. The classification model 30 is a method for 
determining a set of similarity measures with the predefined classes. The 
decision rule is the assignment of class membership 32 on the basis of 
a set of measures... 

...Wiley and Sons, New York (1973); and J. Schurmann, Pattern
Classification: A Unified View of Statistical and Neural Approaches,
John Wiley & Sons, Inc., New York (1996)).
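
The three roles described above (feature extraction, a classification model that yields similarity measures to predefined classes, and a decision rule that assigns membership) can be sketched as three small functions. The first-difference transform, the distance-based similarity, and the prototype dictionary are assumptions for illustration; the excerpt does not say which transformation or similarity measure is used.

```python
import math


def extract_features(spectrum):
    """Feature extraction: first difference of the spectrum (an assumed choice)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def similarity_measures(features, class_prototypes):
    """Classification model: one similarity value per predefined class.

    Similarity is 1 / (1 + Euclidean distance to the class prototype),
    so identical feature vectors score 1.0.
    """
    sims = {}
    for label, prototype in class_prototypes.items():
        dist = math.sqrt(sum((f - p) ** 2 for f, p in zip(features, prototype)))
        sims[label] = 1.0 / (1.0 + dist)
    return sims


def decision_rule(sims):
    """Decision rule: assign membership to the most similar class."""
    return max(sims, key=sims.get)


def classify(spectrum, class_prototypes):
    """Run the extraction, similarity, and decision steps in order."""
    features = extract_features(spectrum)
    return decision_rule(similarity_measures(features, class_prototypes))


# Hypothetical prototypes in the extracted feature space.
prototypes = {"class_a": [1.0, 1.0], "class_b": [-1.0, -1.0]}
```

Keeping the three steps separate means a different transformation or similarity measure can be swapped in without touching the decision rule.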



Within this framework, two different. 
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Research on machine learning has taken numerous different 
directions. The present study focussed on the micro-structural 
characteristics of learning systems. It was postulated that learning systems
consist of a macro-structure which controls the flow of information, and a 
micro-structure which manipulates information for decision making. A review 
of the literature suggested that the basic function of the micro-structure 
of learning systems was to make a choice among a set of alternatives. This 
decision function was then equated with the task of making classification
decisions. On the basis of the requirements for practical learning systems, 
the feature frequency approach was chosen for model development. An 
analysis of the feature frequency approach indicated that an effective 
model must be sensitive to both within-dimension and between-category 
variations in frequencies. A model was then developed to provide for such 
sensitivities. The model was based on Bayes' Theorem with an
assumption of uniform prior probability of occurrence for the categories. 
This model was tested using data collected for neuropsychological diagnosis 
of children. Results of the tests showed that the model was capable of 
learning and provided a satisfactory level of performance. The performance 
of the model was compared with that of other models designed for the same 
purpose. The other models included NEXSYS, a rule-based system
specially designed for this type of diagnosis, discriminant analysis, which 
is a statistical technique widely used for pattern recognition, and 
neural networks, which attempt to simulate the neural activities of the 
brain. Results of the tests showed that the model's performance was
comparable to that of the other models. Further analysis indicated that the 
model has certain advantages in that it has a simple structure, is capable 
of explaining its decisions, and is more efficient than the other models. 
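
The abstract does not give the model's equations. As a rough illustration of a feature-frequency classifier built on Bayes' Theorem with uniform priors, the sketch below counts how often each feature value occurs in each category and multiplies the resulting likelihoods. The class name, the add-one smoothing, and the independence assumption across dimensions are choices made here, not details taken from the study.

```python
from collections import defaultdict


class FeatureFrequencyClassifier:
    """Hypothetical frequency-count classifier with uniform category priors."""

    def __init__(self):
        # counts[(category, dimension)][value] -> number of occurrences
        self.counts = defaultdict(lambda: defaultdict(int))
        self.totals = defaultdict(int)   # training cases seen per category
        self.values = defaultdict(set)   # distinct values seen per dimension

    def learn(self, features, category):
        """Update the frequency tables with one labelled case."""
        self.totals[category] += 1
        for dim, value in enumerate(features):
            self.counts[(category, dim)][value] += 1
            self.values[dim].add(value)

    def posterior(self, features):
        """Posterior probability of each category given the features."""
        if not self.totals:
            raise ValueError("classifier has not been trained")
        scores = {}
        for category, n in self.totals.items():
            likelihood = 1.0
            for dim, value in enumerate(features):
                seen = self.counts[(category, dim)].get(value, 0)
                # Add-one smoothing (an assumption) avoids zero likelihoods.
                likelihood *= (seen + 1) / (n + len(self.values[dim]))
            # With uniform priors the prior term cancels on normalization.
            scores[category] = likelihood
        total = sum(scores.values())
        return {c: s / total for c, s in scores.items()}

    def classify(self, features):
        """Choose the category with the highest posterior probability."""
        post = self.posterior(features)
        return max(post, key=post.get)
```

Because the tables are plain counts, the per-dimension likelihoods can be read back to explain a decision, and `learn` can be called on new cases at any time.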
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Inhibitory effects in artificial neural networks have usually been 
achieved via direct inhibitory connections between competing nodes. This 
mechanism is limited by the large number of inhibitory connections that are 
sometimes necessary, and the difficulty of designing competitive 
interactions with inhibitory connections for some applications. Because of 
these limitations competitive activation mechanisms have been introduced to 
provide competitive interactions between nodes using strictly excitatory 
connections. Competitive activation mechanisms have been successfully 
applied to problems in AI, cognitive modeling, and computational
neuroscience, sometimes producing effects which are difficult or impossible 
to achieve with noncompetitive activation mechanisms. However, applications 
using competitive activation mechanisms have been limited by the absence of 
effective learning methods. 

This dissertation develops the first unsupervised learning method for 
artificial neural networks using competitive activation mechanisms. The 
learning method, a variant of competitive learning, is shown to be 
effective through both computer simulations and mathematical analysis. 
Competitive learning can be used for classification tasks involving the 
separation of input pattern clusters; analysis shows that a typical 
competitive activation model produces a different classification than a 
typical noncompetitive activation model using competitive learning. The 
unsupervised competitive learning rule is extended to include 
reinforced and supervised versions which are also shown to function 
effectively.

Competitive learning has been used successfully with noncompetitive 
activation mechanisms in the past for feature map formation in many 
applications (speech recognition, robotic control, optimization, brain 
modelling, etc.). Computer simulations show that competitive activation 
models can also produce computational map formation with different 
structural characteristics than comparable noncompetitive activation 
models. Further, competitive activation models can generate more rapid and 
extensive map reorganization following network damage than noncompetitive 
activation models. Competitive activation models also support topographic 
map formation/refinement and map reorganization in response to changes in 
the structure of the input stimuli. Evaluating topographic map formation 
necessitated the development of new measurement and plotting techniques 
which are presented here. This work shows that competitive learning using 
competitive activation mechanisms is a powerful approach for artificial 
neural networks. 
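
For orientation, the conventional form of unsupervised competitive learning can be sketched as follows: the unit whose weight vector is closest to the input wins, and only the winner's weights move toward that input. This is the textbook winner-take-all rule with explicit winner selection, not the competitive activation mechanism developed in the dissertation, whose equations the abstract does not give. The function names, learning rate, and sample data are assumptions.

```python
def nearest_unit(weights, x):
    """Index of the unit whose weight vector is closest to input x."""
    return min(
        range(len(weights)),
        key=lambda k: sum((w - xi) ** 2 for w, xi in zip(weights[k], x)),
    )


def competitive_learning(inputs, initial_weights, learning_rate=0.1, epochs=20):
    """Move only the winning unit's weights toward each presented input."""
    weights = [list(w) for w in initial_weights]
    for _ in range(epochs):
        for x in inputs:
            k = nearest_unit(weights, x)
            weights[k] = [
                w + learning_rate * (xi - w) for w, xi in zip(weights[k], x)
            ]
    return weights


# Two made-up input clusters, one near (0, 0) and one near (1, 1).
samples = [(0.0, 0.1), (0.1, 0.0), (0.0, 0.0),
           (1.0, 0.9), (0.9, 1.0), (1.0, 1.0)]
trained = competitive_learning(samples, [(0.2, 0.2), (0.8, 0.8)])
```

After training, each unit's weight vector sits inside one of the two clusters, so `nearest_unit` acts as a cluster classifier.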
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Abstract: We present the Cartesian granule feature, a new multidimensional
feature formed over the cross product of fuzzy partition labels. 
Traditional fuzzy modelling approaches mainly use flat features (one
dimensional features) and consequently suffer from decomposition error
when modelling systems where there are dependencies between the input
variables. Cartesian granule features help reduce (if not eliminate) the
error due to the decompositional usage of features. In the approach taken
here, we label the (fuzzy) subsets which partition the various universes
and incorporate these labels in the form of Cartesian granules into our
modelling process. Fuzzy sets defined in terms of these Cartesian granules
are extracted automatically from statistical data using the theory of
mass assignments, and are incorporated into fuzzy rules. Consequently
we not only compute with words, we also model with words. Due to the
interpolative nature of fuzzy sets, this approach can be used to model
both classification and prediction problems. Overall, Cartesian granule
features incorporated into fuzzy rules yield glass-box models and,
when demonstrated on the ellipse classification problem, yield a
classification accuracy of 98%, outperforming standard modelling approaches 
such as neural networks and the data browser. (21 Refs) 
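
To make the cross product of fuzzy partition labels concrete, the sketch below fuzzifies each input variable over a labelled partition and forms one granule per combination of labels. The triangular membership functions, the three-label partition, and the use of the product to combine memberships are assumptions; the abstract does not state them, and the mass-assignment step used to learn fuzzy sets from data is not reproduced.

```python
from itertools import product


def triangular(x, left, peak, right):
    """Triangular membership function with support (left, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical three-label fuzzy partition of the interval [0, 1].
PARTITION = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}


def fuzzify(x, partition):
    """Membership of x in each labelled fuzzy subset of one variable."""
    return {label: triangular(x, *abc) for label, abc in partition.items()}


def cartesian_granules(point, partitions):
    """Membership of a point in each granule (one label per variable).

    Granule membership is taken as the product of the per-variable
    memberships (an assumed combination rule); zero-degree granules
    are omitted.
    """
    per_variable = [fuzzify(x, p) for x, p in zip(point, partitions)]
    granules = {}
    for labels in product(*(m.keys() for m in per_variable)):
        degree = 1.0
        for membership, label in zip(per_variable, labels):
            degree *= membership[label]
        if degree > 0.0:
            granules[labels] = degree
    return granules
```

A granule such as `("low", "high")` describes both variables jointly, which is what lets the feature capture dependencies that separate one-dimensional features would miss.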
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Abstract: A computer processing method was developed to objectively 
classify disease in the lower limb arteries evaluated by noninvasive 
ultrasonic duplex scanning. This method analyzes Doppler blood flow 
signals, extracts diagnostic features from Doppler spectrograms and 
classifies the severity of the disease into three categories of diameter 
reduction (0-19%, 20-49% and 50-99%). The features investigated were
based on frequency features obtained at peak systole, spectral broadening 
indices and normalized amplitudes of the power spectrogram computed in 
various positive and negative frequency bands. A total of 379 arterial 
segments studied from the aorta to the popliteal artery were classified
using a pattern recognition method based on the Bayes model. Two
classification schemes using a two-node decision rule were tested.
Both schemes gave similar results: the first provided an overall
accuracy of 83% (Kappa=0.42) and the second an overall accuracy of 81%
(Kappa=0.35) when compared with conventional biplane contrast
arteriography. These performances, especially for the 0 to 19% lesion
category, are better than those obtained by the technologist
(accuracy=76% and Kappa=0.33), based on visual interpretation of the 
Doppler spectrograms. (38 Refs) 
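
The abstract reports both overall accuracy and Kappa. Assuming Kappa here means Cohen's kappa, both can be computed from a confusion matrix of classifier output against the reference method, as in the sketch below. The function name is hypothetical and the counts are invented for illustration; they are not the study's data.

```python
def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    Rows are the reference categories, columns the classifier's output.
    """
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Invented counts for three categories of diameter reduction.
example = [
    [40, 5, 5],
    [5, 20, 5],
    [5, 5, 10],
]
accuracy, kappa = accuracy_and_kappa(example)
```

Kappa discounts the agreement expected by chance, which is why it can be modest even when overall accuracy looks high, as in the figures quoted above.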
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